The Effects on the Growth of Trees

Introduction

Trees are important to our environment: they provide homes and food for many different organisms, and they take up carbon dioxide and release oxygen into the ecosystem. Many factors influence a tree's growth, but sunlight, water, and nutrients are essential.

This project analyzes trees planted on the streets of Metro Vancouver and examines their growth in relation to the categorical variables provided in the dataset.

Questions of interest

  1. Do different neighbourhoods have different effects on trees' growth?
  2. Does the number of trees planted in certain areas have an effect on trees' growth?
  3. Does the presence of a root barrier have an effect on trees' growth?
  4. Do different genera of trees have different growth rates?

The Tree Data

The first column is only an identifier and does not appear to be useful, so it will be dropped.

The tree dataset has 5000 observations and 21 columns. All columns except date_planted, plant_area, and cultivar_name are complete: date_planted is missing more than half of its observations, plant_area is missing 50, and cultivar_name is missing just under half. The data has 12 string objects, 3 float values, 3 integer values, and 1 datetime value.
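A summary like the one above can be produced with a few calls. This is only a minimal sketch, assuming the data is held in a pandas DataFrame; the name trees and the handful of rows below are made up for illustration and are not the real dataset.

```python
import pandas as pd

# Made-up rows standing in for the street-tree data (illustration only).
trees = pd.DataFrame({
    "date_planted": pd.to_datetime(["1999-03-14", None, None, "2005-11-02"]),
    "plant_area": ["B", None, "10", "C"],
    "cultivar_name": [None, "Bloodgood", None, "Kwanzan"],
    "diameter": [10.0, 22.5, 6.0, 3.0],
    "height_range_id": [2, 5, 1, 1],
})

# Number of observations and columns.
n_rows, n_cols = trees.shape

# Missing values per column, keeping only the incomplete columns.
missing = trees.isna().sum()
incomplete = missing[missing > 0].sort_values(ascending=False)

# How many columns there are of each data type.
dtype_counts = trees.dtypes.astype(str).value_counts()

print(n_rows, n_cols)
print(incomplete)
print(dtype_counts)
```

On the real data, the same calls would list date_planted, plant_area, and cultivar_name as the incomplete columns.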

Based on the questions of interest, only the columns neighbourhood_name, date_planted, diameter, genus_name, height_range_id, and root_barrier will be used; the other columns will be dropped. The column date_planted is formatted as YYYY-MM-DD. As only the year is useful, it will be extracted into a year_planted column and converted to an integer for easier access.
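The column selection and year extraction could look roughly like this. It is a sketch under assumptions: pandas is assumed as the tool, the toy rows are invented, and tree_id is a hypothetical name for the identifier column. Because date_planted has missing values, the year is stored in pandas' nullable Int64 type so that it stays an integer where the date is absent.

```python
import pandas as pd

# Made-up rows for illustration; tree_id is a hypothetical identifier column.
trees = pd.DataFrame({
    "tree_id": [1, 2, 3],
    "neighbourhood_name": ["Strathcona", "Renfrew-Collingwood",
                           "Kensington-Cedar Cottage"],
    "date_planted": ["1999-03-14", None, "2005-11-02"],
    "diameter": [10.0, 22.5, 6.0],
    "genus_name": ["Acer", "Platanus", "Prunus"],
    "height_range_id": [2, 5, 1],
    "root_barrier": ["N", "N", "Y"],
    "cultivar_name": [None, "Bloodgood", None],
})

# Keep only the columns needed for the questions of interest.
keep = ["neighbourhood_name", "date_planted", "diameter",
        "genus_name", "height_range_id", "root_barrier"]
trees = trees[keep].copy()

# Parse the YYYY-MM-DD strings; missing dates become NaT.
planted = pd.to_datetime(trees["date_planted"], format="%Y-%m-%d")

# Nullable Int64 keeps the year an integer even where the date is missing.
trees["year_planted"] = planted.dt.year.astype("Int64")

print(trees[["date_planted", "year_planted"]])
```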

Exploratory Visualizations

Question 1: Do different neighbourhoods have different effects on trees' growth?

Before diving into the question, let's visualize how many trees are present in each neighbourhood. A higher tree count could indicate that a neighbourhood offers more resources for trees to grow on.

According to the bar graph of the number of trees in each neighbourhood, the neighbourhoods with the most trees planted are Renfrew-Collingwood and Kensington-Cedar Cottage; they each have over 350 trees. The neighbourhood with the fewest trees is Strathcona, with fewer than 100.
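The counts behind such a bar graph can be obtained by counting rows per neighbourhood. The following is a small sketch assuming pandas; the rows are invented, so the counts do not match the real data.

```python
import pandas as pd

# Made-up rows for illustration; the real data has one row per street tree.
trees = pd.DataFrame({
    "neighbourhood_name": [
        "Renfrew-Collingwood", "Renfrew-Collingwood", "Renfrew-Collingwood",
        "Kensington-Cedar Cottage", "Kensington-Cedar Cottage",
        "Strathcona",
    ]
})

# One row per tree, so counting rows per neighbourhood counts trees.
tree_counts = trees["neighbourhood_name"].value_counts()

# Neighbourhoods with the most and the fewest trees.
most_trees = tree_counts.idxmax()
fewest_trees = tree_counts.idxmin()

print(tree_counts)
```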

Trees grow primarily in height and length, while their thickness is determined by secondary growth. As height is the primary growth, I will focus only on the column height_range_id.

Let's take a look at the height of the trees in each neighbourhood for each year planted.

From the faceted graph, trees' growth and their neighbourhood do not appear to be correlated with one another.

Question 2: Does the number of trees planted in certain areas have an effect on the trees' growth?

If more trees are planted in certain areas than others, they would need to share and compete for resources, which would slow down their growth. Let's compare the graphs of the average height of trees for each year and the total number of trees planted each year.

Some correlation appears to be present between the average height of the trees and the total number of trees planted each year. In the years where fewer trees were planted than in the few years before or after (1991, 2003), the average height of the trees appears to be greater.

Question 3: Does the presence of a root barrier have any effect on trees' growth?
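One way to put a number on that visual impression is to aggregate by year and compute a correlation coefficient between the two yearly series. This is a sketch assuming pandas and a year_planted column; the rows are invented, and the Pearson correlation is my own choice of measure here, not something the graphs themselves report.

```python
import pandas as pd

# Made-up rows for illustration only.
trees = pd.DataFrame({
    "year_planted": [1991, 1991, 1992, 1992, 1992, 1992, 1993, 1993, 1993],
    "height_range_id": [4, 6, 2, 3, 2, 1, 3, 3, 3],
})

# Per year: how many trees were planted and their average height range.
yearly = trees.groupby("year_planted").agg(
    n_trees=("height_range_id", "size"),
    mean_height=("height_range_id", "mean"),
)

# Pearson correlation between yearly planting counts and yearly mean height.
# A negative value means years with fewer plantings tend to have taller trees.
corr = yearly["n_trees"].corr(yearly["mean_height"])

print(yearly)
print(corr)
```

With only a few dozen planting years, a coefficient like this is a rough indication at best, and it does not separate the effect of competition from the effect of tree age.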

Let's explore how many trees within the dataset have root barriers installed.

The bar graph shows that fewer than 10% of the trees in the dataset do not have a root barrier installed. Let's now compare the mean height of trees with and without a root barrier.

Trees without root barriers appear to have a higher average height than trees with root barriers: their average height is twice that of trees with root barriers.
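The share of trees in each root barrier group and the mean height per group can be computed as follows. This is a sketch assuming pandas and assuming root_barrier is coded as "Y" and "N"; that coding and the rows below are assumptions for illustration.

```python
import pandas as pd

# Made-up rows; the "Y"/"N" coding of root_barrier is an assumption.
trees = pd.DataFrame({
    "root_barrier": ["N", "N", "Y", "Y"],
    "height_range_id": [4, 2, 1, 2],
})

# Fraction of trees in each root barrier group.
share = trees["root_barrier"].value_counts(normalize=True)

# Mean height range for trees with and without a root barrier.
mean_height = trees.groupby("root_barrier")["height_range_id"].mean()

# How many times taller, on average, trees without a barrier are.
ratio = mean_height["N"] / mean_height["Y"]

print(share)
print(mean_height)
```

Since height_range_id is a range identifier rather than a measured height, a ratio of means should be read as a rough comparison only.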

Question 4: Do different genera of trees have different growth rates?

From the above graph, some genera of trees appear to grow taller than others. The most obvious is the genus Platanus: in the years in which it appears, it sits within the top rows.
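The same comparison can be tabulated instead of read off a graph, with one row per genus and one column per planting year. This is a sketch assuming pandas and a year_planted column; the rows are invented, so the values say nothing about the real trees.

```python
import pandas as pd

# Made-up rows for illustration only.
trees = pd.DataFrame({
    "genus_name": ["Platanus", "Platanus", "Acer", "Acer", "Prunus", "Prunus"],
    "year_planted": [1995, 1996, 1995, 1996, 1995, 1996],
    "height_range_id": [6, 5, 3, 3, 2, 1],
})

# Mean height range per genus (rows) and planting year (columns).
height_by_genus_year = trees.pivot_table(
    index="genus_name",
    columns="year_planted",
    values="height_range_id",
    aggfunc="mean",
)

# Rank genera within each year (1 = tallest on average).
rank_by_year = height_by_genus_year.rank(ascending=False)

# The tallest genus for each planting year.
tallest_each_year = height_by_genus_year.idxmax()

print(height_by_genus_year)
print(tallest_each_year)
```

Comparing within a planting year keeps tree age roughly constant, which is what makes the differences between genera readable as differences in growth rate.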

Conclusion

The categorical variables of interest, neighbourhood_name and genus_name, have many unique values, which can get confusing to read. It would be best to filter the data so that each of these columns has only around 10 unique values. Similarly, for the column year_planted, it would be best to keep only trees planted from 1992 to 2014, since most trees were planted within those years.

The four graphs that will be useful in the final project are the first, second, sixth, and seventh, as they fit the questions of interest best.
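The filtering described above could be sketched like this, assuming pandas. The helper keep_top_categories is a hypothetical name introduced for illustration, "around 10 unique values" is interpreted here as keeping the most frequent values, and the rows are invented.

```python
import pandas as pd

def keep_top_categories(df, column, n=10):
    """Keep only rows whose value in `column` is among the n most frequent."""
    top = df[column].value_counts().nlargest(n).index
    return df[df[column].isin(top)]

# Made-up rows for illustration only.
trees = pd.DataFrame({
    "genus_name": ["Acer", "Acer", "Acer", "Prunus", "Prunus", "Platanus"],
    "year_planted": [1990, 1995, 2001, 2014, 2016, 2003],
})

# n=2 only because the toy data is tiny; the text suggests around 10.
subset = keep_top_categories(trees, "genus_name", n=2)

# Keep planting years 1992 to 2014; between() includes both endpoints.
subset = subset[subset["year_planted"].between(1992, 2014)]

print(subset)
```

The same helper can be applied to neighbourhood_name. Keeping the most frequent categories is only one possible rule; categories could also be chosen by relevance to the questions of interest.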

Resources

  1. Data Visualization
  2. Street tree data
  3. Trees